Many studies assume stock prices follow a random process known as geometric Brownian motion.
Although approximately correct, this model fails to explain the frequent occurrence of extreme price
movements, such as stock market crashes. Using a large collection of data from three different stock
markets, we present evidence that a modification to the random model - adding a slow, but significant,
fluctuation to the standard deviation of the process - accurately explains the probability
of different-sized price changes, including the relatively high frequency of extreme movements.
Furthermore, we show that this process is similar across stocks so that their price fluctuations can be
characterized by a single curve. Because the behavior of price fluctuations is rooted in the
characteristics of volatility, we expect our results to bring increased interest to stochastic volatility models,
and especially to those that can produce the properties of volatility reported here.



INTRODUCTION 

The first theoretical study of stock prices modeled price
differences as a simple random process - now commonly
known as a drunkard's walk. Although this model was
pioneering for its time, it has needed several modifications.
First was the realization that prices move in relative
amounts rather than absolute amounts, and that
returns rather than price differences should be modeled
as a random process. Next, several papers showed
that returns could not be described by a static Gaussian
process because the tails of the return distribution
are too fat, i.e., large price fluctuations occur much too
frequently. Numerous studies have tried to characterize
and explain this phenomenon, because understanding
the probability of large returns
is very important for asset allocation, option pricing, and
risk management. In spite of this work, there is still
no accepted theoretical explanation for this feature.
Here we present evidence that the non-Gaussian, fat- 
tailed shape of the return distribution is explained by 
modeling returns as a random process with a slowly fluc- 
tuating standard deviation (or volatility). Previously, 
we have found that this model works well for several 
stocks traded on the London Stock Exchange (e-print 
arXiv:0906.3841). Here we test the model using a larger 
collection of stocks from different exchanges and different 
time periods. We show that the return distribution for 
these stocks is similar in shape and well fit by the model,
and we present evidence that the tail of the distribution 
for each stock is determined by the properties of volatility 
for that stock. 

The idea that volatility fluctuations cause non-Gaussian
returns is not new - it was originally suggested
several decades ago and is known as the
mixture-of-distributions hypothesis [12, 13, 14, 15]. This
hypothesis can explain the non-Gaussian shape of the return
distribution, but it is unable to explain the apparent
stability of the distribution over longer time scales.
To account for this stability, others have suggested what
is known as the stable Paretian hypothesis - that returns
are drawn unconditionally from a fat-tailed, stable
distribution. Our model captures both the
non-Gaussian shape and the apparent stability of the 
return distribution by assuming that volatility fluctua- 
tions are significant over long time scales but relatively 
small over short time scales. The model can be summa- 
rized as follows: On any single day, returns are well de- 
scribed by a Gaussian distribution. Across days, weeks, 
and months, however, slow but significant fluctuations in 
volatility produce returns with different standard devia- 
tions. When collecting returns from each of these peri- 
ods into one plot, the return distribution no longer looks 
Gaussian, but is fat-tailed. The distribution keeps this 
shape when aggregating returns over longer time scales 
because volatility is slowly varying. Because this process 
occurs in a similar way across stocks, the distributions of
returns for different stocks collapse onto one curve.
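The mechanism summarized above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the inverse variance is redrawn once per "day" from a gamma distribution with shape n/2 and mean beta0 (an assumed parametrization), and the function names, the number of days, and the number of returns per day are hypothetical choices made for illustration.

```python
import random


def simulate_returns(n_days=2000, per_day=50, n=4.4, beta0=1.0, seed=0):
    """Gaussian returns whose inverse variance beta is redrawn once per day.

    Assumption: beta is gamma distributed with shape n/2 and mean beta0.
    """
    rng = random.Random(seed)
    returns = []
    for _ in range(n_days):
        # gammavariate(alpha, theta) has mean alpha * theta,
        # so theta = 2 * beta0 / n gives mean beta0.
        beta = rng.gammavariate(n / 2.0, 2.0 * beta0 / n)
        sigma = beta ** -0.5  # volatility for this day
        # Within a day the returns are plain Gaussian draws.
        returns.extend(rng.gauss(0.0, sigma) for _ in range(per_day))
    return returns


def excess_kurtosis(xs):
    """Sample excess kurtosis; close to 0 for Gaussian data, positive for fat tails."""
    count = len(xs)
    mean = sum(xs) / count
    var = sum((x - mean) ** 2 for x in xs) / count
    fourth = sum((x - mean) ** 4 for x in xs) / count
    return fourth / var ** 2 - 3.0
```

Pooling the simulated days gives a clearly positive excess kurtosis, while a single-volatility Gaussian sample of the same size stays near zero, which is the qualitative point of the paragraph above.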

The results we present are produced using a large 
amount of data (of the order of $10^7$ data points) from
three stock markets over three time periods: the London 
Stock Exchange (LSE) from May 2, 2000 to December 
31, 2002, the New York Stock Exchange (NYSE) from 
January 2, 2001 to December 31, 2002, and the Span- 
ish Stock Exchange (SSE) from January 2, 2004 to De- 
cember 29, 2006. These time periods partially overlap 
for the NYSE and LSE data and are different for the 
SSE data. The time discrepancies are due to obtaining 
data from different sources, and the results we present 
appear robust over these differences. For each market, 
we study two highly traded stocks that are from different
market sectors: AstraZeneca (AZN) and Vodafone
(VOD) from the LSE, International Business Machines 
(IBM) and General Motors (GM) from the NYSE, and 
Telefonica (TEF) and Banco Santander (SAN) from the 
SSE. We consider the electronic markets for these stocks 
during normal trading hours, and we measure returns 
whenever the mid-price of a stock fluctuates. This ap- 
proach allows us to study returns on the finest possible 
time scale. When aggregating returns over longer time 
scales, we use non-overlapping intervals. We measure
price fluctuations, or returns, in the standard way [6] as
$r_t(\tau) = \ln p_{t+\tau} - \ln p_t$, where $p$ is the mid-price, $t$ is the
time (which we update by one unit whenever the price
changes), and $\tau$ is the time increment. Because time is
updated whenever the price changes, it is a measure of
the number of events that have occurred and not a measure
of 'calendar' or 'clock' increments.
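The return measurement described above can be sketched in a few lines. The function names and the input format (a plain sequence of mid-price observations) are assumptions made for illustration, not the authors' code.

```python
import math


def price_change_events(mid_prices):
    """Keep only the observations at which the mid-price differs from the previous one."""
    events = []
    for p in mid_prices:
        if not events or p != events[-1]:
            events.append(p)
    return events


def event_time_returns(event_prices, tau):
    """Log returns over non-overlapping windows of tau price changes.

    event_prices: mid-prices indexed in event time (one entry per price change).
    """
    if tau < 1:
        raise ValueError("tau must be a positive integer")
    logs = [math.log(p) for p in event_prices]
    # Step by tau so that consecutive windows do not overlap.
    return [logs[t + tau] - logs[t] for t in range(0, len(logs) - tau, tau)]
```

Calling `event_time_returns(price_change_events(mids), tau)` for each time increment of interest gives one return series per time scale, all measured in event time rather than clock time.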



ANALYSIS 

To model the features of the return distribution, we 
use a general approach that assumes a Gaussian process 
for its dynamics. The probability distribution of returns
is therefore [17]

$$P(r, \tau | \beta) = \sqrt{\frac{\beta}{2\pi\tau}} \exp\left(-\frac{\beta r^2}{2\tau}\right). \qquad (1)$$

This is coupled with a slow variation of the inverse variance
$\beta = 1/\sigma^2$, where $\sigma$ is the volatility, $\langle r^2(\tau) \rangle = \sigma^2 \tau$.
By slow variation, we mean that $\beta$ fluctuations are negligible
compared to price fluctuations when observed over
the time scales we study here - up to one trading day. 
This is not inconsistent with shocks to volatility as long 
as those shocks are relatively infrequent. Others have re- 
ported systematic fluctuations in intraday volatility (see 
[18] and references within), but these fluctuations closely
mimic trading activity within the day. Because we probe
returns over a fixed number of return-causing events,
fluctuations in trading activity are removed from the
analysis.

$\beta$ fluctuations over time scales longer than one day
can be characterized by a probability distribution $g(\beta)$.
Several papers have stated different functional forms for
the distribution of volatility. We propose - and
the evidence presented here supports our assumption -
that $g(\cdot)$ is similar across stocks and close to a gamma
distribution.
There are several simple explanations for why the inverse
variance might have this distribution [20].
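One way to write the gamma distribution referred to above, in terms of the parameters $\beta_0$ and $n$ quoted in the figures, is sketched below. The parametrization (shape $n/2$, mean $\beta_0$) is an assumption: it is the choice that is consistent with the scaled return $r' = r\sqrt{2\beta_0/(n\tau)}$ and the constant $A$ used later in the text.

```latex
g(\beta) = \frac{1}{\Gamma(n/2)}
           \left(\frac{n}{2\beta_0}\right)^{n/2}
           \beta^{\,n/2 - 1}
           \exp\!\left(-\frac{n\beta}{2\beta_0}\right),
\qquad \beta > 0,
\qquad \langle \beta \rangle = \beta_0 .
```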
FIG. 1: Collapse of the complementary cumulative distribution
(ccd) of absolute scaled returns, $C(|r'|)$, for
the stock IBM. The ccd is shown for time scales $\tau = 10$
to $\tau = 640$. The solid black line is the theoretical ccd using
$\beta_0 = 1.28 \times 10^7$ and $n = 4.40$ from fitting $\beta$ to a gamma
distribution. Inset: ccd of the slowly fluctuating variable $\beta$ for
IBM; the red curve is the empirical ccd and the solid black
line is a fit to a gamma distribution.



A straightforward integration of the conditional probability
of returns, $P(r, \tau | \beta)$, over the distribution $g(\beta)$
yields the return distribution $P(r, \tau)$, Eq. (3),
which is a variant of the Student's t-distribution. The
non-Gaussian shape of the distribution results from collecting
returns from time periods separated by long intervals
where $\beta$ is different. The stability of this shape
for short to intermediate $\tau$ results from negligible fluctuations
of $\beta$ over these time scales.
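The integration mentioned above can be sketched explicitly. The steps below assume a zero-mean Gaussian with variance $\tau/\beta$ for the conditional distribution and a gamma distribution for $\beta$ with shape $n/2$ and mean $\beta_0$; this parametrization is an assumption, chosen to agree with the scaled return and the constant $A$ defined later in the text.

```latex
\begin{align}
P(r,\tau) &= \int_0^\infty P(r,\tau|\beta)\, g(\beta)\, d\beta \\
  &= \frac{\left(n/2\beta_0\right)^{n/2}}{\Gamma(n/2)\sqrt{2\pi\tau}}
     \int_0^\infty \beta^{(n+1)/2 - 1}
     \exp\!\left[-\beta\left(\frac{n}{2\beta_0} + \frac{r^2}{2\tau}\right)\right] d\beta \\
  &= \frac{\Gamma[(n+1)/2]}{\Gamma[n/2]}
     \sqrt{\frac{\beta_0}{\pi n \tau}}
     \left(1 + \frac{\beta_0 r^2}{n\tau}\right)^{-(n+1)/2}
\end{align}
```

The last line is a Student's t-distribution with $n$ degrees of freedom in the variable $r\sqrt{\beta_0/\tau}$, so its tail decays as a power law with exponent set by $n$.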

Although it is known that a gamma distributed inverse
variance leads to a Student's t-distribution for
returns [15], this result does not explain how the
return distribution retains its non-Gaussian shape for
longer time scales. To explain the persistence of the
non-Gaussian shape, others have suggested that returns follow
a fat-tailed stable distribution [16]. In Eq. (3)
we address both the non-Gaussian shape and the apparent
stability of the return distribution - both result from
the properties of volatility that we have assumed in our
model.

Other papers have reported that returns follow a Student's
t-distribution and have fitted returns to a generic
version of this distribution (see [15] for examples).
In the results we present below, we do not fit a Student's
t-distribution, but instead compare the empirical distribution
to the predicted distribution as expressed in Eq. (3)
and as determined by the independent measurement of
$\beta_0$ and $n$. This specifically tests the model rather than
the more general result that returns follow a Student's
t-distribution.

FIG. 2: Collapse of the return distribution on the function
$f(r')$, Eq. (4), for the stocks in our study. For each
stock, the return distribution for $\tau = 80$ is shown in logarithmic
coordinates. Inset: The same plot in regular coordinates.

RESULTS 

In Fig. 1, we show the time collapse of the complementary
cumulative distribution (ccd) of absolute scaled
returns, $C(|r'|)$ with $r' = r \sqrt{2\beta_0/(n\tau)}$, for the stock IBM
(the ccd is the integral of the probability density over the tail beyond $|r'|$). The
ccd is plotted for $\tau = 10$ to $\tau = 640$, which is up to one
trading day for the stocks in our study. We show this
plot in logarithmic coordinates to focus on the tails of
the distribution, and we overlay the plot with the ccd
of the theoretical distribution from Eq. (3). As seen, the
model matches the data well and the shape of the distribution
is stable over these time scales. The parameters
$\beta_0$ and $n$ are determined using a maximum likelihood fit
of $\beta$ to a gamma distribution, where $\beta$ is measured once
per day. In the inset of this figure, we show the ccd of
$\beta$ compared to the fit. Although not shown, these plots
are very similar for the other stocks in our study.
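The scaling and the empirical ccd used for Fig. 1 can be sketched as follows. The function names are hypothetical, and the scaling factor assumes the form $r' = r\sqrt{2\beta_0/(n\tau)}$; the values of `beta0` and `n` would come from a separate fit of the daily $\beta$ measurements, which is not shown here.

```python
import math


def scale_returns(returns, tau, beta0, n):
    """Scale returns measured over tau events: r' = r * sqrt(2 * beta0 / (n * tau))."""
    factor = math.sqrt(2.0 * beta0 / (n * tau))
    return [r * factor for r in returns]


def empirical_ccd(values):
    """Empirical ccd of |values|.

    Returns (x, fraction) pairs sorted by x, where fraction is the share of
    observations with absolute value at or above x. Tied values are ranked
    individually, so only the first of a tied group carries the exact share.
    """
    xs = sorted(abs(v) for v in values)
    total = len(xs)
    return [(x, (total - i) / total) for i, x in enumerate(xs)]
```

Plotting the pairs returned by `empirical_ccd(scale_returns(...))` on logarithmic axes for each time scale should, if the model holds, place the curves on top of one another.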

The above model assumes that the functional form of 
the return distribution is similar across stocks, and that 
differences are due to the particular properties of volatil- 
ity for each stock. This is verified in Fig. 2, where we 
show the collapse for all stocks using the following func- 
tional transformation, derived from the analytical results 
presented above: 

$$f(r') = [A\,P(r', \tau)]^{2/(n+1)}, \qquad (4)$$

where $A = \sqrt{2\pi}\,\Gamma[n/2]/\Gamma[(n+1)/2]$. Notice that Fig. 2
shows not only the collapse of the distribution across
stocks but also the normal transport explicitly suggested
by Eqs. (1,3) and observed in Fig. 1.
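The reason a single curve can describe all stocks can be sketched as follows. The sketch assumes the Student-type return distribution obtained from a gamma distributed $\beta$ with shape $n/2$ and mean $\beta_0$, takes $P(r', \tau)$ to be the density of the scaled return $r' = r\sqrt{2\beta_0/(n\tau)}$, and assumes that the exponent in Eq. (4) is $2/(n+1)$, the power that cancels the dependence on $n$. That exponent is a reconstruction from the surrounding definitions rather than a quotation.

```latex
A\,P(r',\tau) = \left(1 + \frac{r'^2}{2}\right)^{-(n+1)/2}
\quad\Longrightarrow\quad
f(r') = \left[A\,P(r',\tau)\right]^{2/(n+1)} = \frac{1}{1 + r'^2/2}
```

Under these assumptions the right-hand side contains neither $\beta_0$, $n$, nor $\tau$, which is why distributions for different stocks and time scales would fall on the same curve.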



FIG. 3: Predicted vs. empirical tail exponent for the 
stocks under study. The tail exponent is the asymptotic 
slope of the tail of the ccd when measured in logarithmic 
coordinates. The dashed line shows $y = x$ for comparison
only. 

Finally, in Fig. 3, we focus on the probability of large 
returns and compare the tail of the observed distribu- 
tion to that of the predicted distribution for each stock. 
For this figure, we measure the slope of the tail of the 
empirical ccd (in logarithmic coordinates) using the Hill 
estimator [21] on the largest five percent of the data. We
do this for $\tau = 10, 20, 40, 80, 160, 320$ and average the results
(we do not include $\tau = 640$ because there are too
few data points to get a reliable estimate at this time 
scale). This is compared with the slope of the tail from 
the predicted distribution in the same region. The mea- 
sured values are in good agreement with our predictions, 
showing a pronounced variation across stocks that is ex- 
plained by our model. This indicates that the likelihood 
of extreme price movements is determined by the parameters
$\beta_0$ and $n$, obtained from fitting $\beta$ to a gamma
distribution for each stock. 
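The tail measurement described above can be sketched with the standard Hill estimator applied to the largest five percent of absolute returns. The function name and the handling of zeros and ties are illustrative assumptions, not details taken from the paper.

```python
import math


def hill_tail_exponent(values, tail_fraction=0.05):
    """Hill estimate of the power-law exponent of the ccd tail of |values|.

    Uses the k largest absolute values, with k = tail_fraction * sample size,
    measured relative to the (k+1)-th largest value as the threshold.
    """
    xs = sorted((abs(v) for v in values if v != 0), reverse=True)
    k = int(len(xs) * tail_fraction)
    if k < 1 or k >= len(xs):
        raise ValueError("not enough data for the requested tail fraction")
    threshold = xs[k]
    log_excess = sum(math.log(x / threshold) for x in xs[:k])
    if log_excess == 0.0:
        raise ValueError("tail values are all equal to the threshold")
    return k / log_excess
```

To mirror the procedure in the text, one would call this function on the returns for each time scale from 10 to 320 events and average the resulting estimates.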



DISCUSSION 

We have presented evidence that the non-Gaussian 
shape and stable scaling of the return distribution are due 
to slow, but significant, fluctuations in volatility. Fur- 
thermore, our results suggest that return distributions 
for stocks from different exchanges, time periods, and 
over different time scales can be described by one func- 
tional form. Because we have only studied well-known 
stocks from liquid exchanges, it is unknown if this appar- 
ent universal behavior for liquid stocks will carry over to 
stocks that are infrequently traded. 

Since the behavior of price fluctuations is rooted in the 
characteristics of volatility, we expect our results to bring 
increased interest to stochastic volatility models [22], and
especially to those that can produce a gamma-distributed
$\beta$ (also e-print arXiv:physics/0507073).
Such models can provide important insight into the fun- 
damental mechanism that underlies price fluctuations. 